AITopics | auxiliary objective

Collaborating Authors

auxiliary objective

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

a0709efe5139939ab69902884ecad9c1-Supplemental-Conference.pdf

Neural Information Processing SystemsFeb-19-2026, 09:10:16 GMT

mlr, objective, representation, (11 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

50eb39ab717507cccbe2b8590de32030-Paper-Conference.pdf

Neural Information Processing SystemsFeb-8-2026, 22:47:07 GMT

A standard solution to speed up the process is to leverage additional reward signals, shaping it to better guide the learning process.

machine learning, reinforcement learning, trajectory, (16 more...)

Neural Information Processing Systems

Country:

North America > Dominican Republic (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.31)

Add feedback

EAGER: Asking and Answering Questions for Automatic Reward Shaping in Language-guided RL

Neural Information Processing SystemsDec-24-2025, 04:58:37 GMT

Reinforcement learning (RL) in long horizon and sparse reward tasks is notoriously difficult and requires a lot of training steps. A standard solution to speed up the process is to leverage additional reward signals, shaping it to better guide the learning process.In the context of language-conditioned RL, the abstraction and generalisation properties of the language input provide opportunities for more efficient ways of shaping the reward.In this paper, we leverage this idea and propose an automated reward shaping method where the agent extracts auxiliary objectives from the general language goal. These auxiliary objectives use a question generation (QG) and a question answering (QA) system: they consist of questions leading the agent to try to reconstruct partial information about the global goal using its own trajectory.When it succeeds, it receives an intrinsic reward proportional to its confidence in its answer. This incentivizes the agent to generate trajectories which unambiguously explain various aspects of the general language goal.Our experimental study using various BabyAI environments shows that this approach, which does not require engineer intervention to design the auxiliary objectives, improves sample efficiency by effectively directing the exploration.

automatic reward shaping, language-guided rl, name change, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.77)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.60)

Add feedback

Enhancing Compositional Reasoning in CLIP via Reconstruction and Alignment of Text Descriptions

Kwon, Jihoon, Min, Kyle, Sohn, Jy-yong

arXiv.org Artificial IntelligenceOct-21-2025

Despite recent advances, vision-language models trained with standard contrastive objectives still struggle with compositional reasoning -- the ability to understand structured relationships between visual and linguistic elements. This shortcoming is largely due to the tendency of the text encoder to focus on individual words rather than their relations, a limitation reinforced by contrastive training that primarily aligns words with visual objects. In this paper, we introduce REconstruction and Alignment of text Descriptions (READ), a fine-tuning method designed to enhance compositional reasoning by adding two auxiliary objectives to the contrastive learning: (1) a token-level reconstruction objective, where a frozen pre-trained decoder reconstructs alternative captions based on the embedding of the original caption; and (2) a sentence-level alignment objective, which explicitly aligns paraphrased sentences in the embedding space. We show that READ-CLIP, a model derived by applying the READ method to the pre-trained CLIP model, achieves the state-of-the-art performance across five major compositional reasoning benchmarks, outperforming the strongest conventional fine-tuning baseline by up to 4.1%. Furthermore, applying the READ to existing CLIP variants (including NegCLIP and FSC-CLIP) also improves performance on these benchmarks. Quantitative and qualitative analyses reveal that our proposed objectives -- reconstruction and alignment -- offer complementary benefits: the former encourages the encoder to capture relationships between words within a caption, while the latter ensures consistent representations for paraphrases expressed with different wording.

caption, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.1654

Country:

Europe > Switzerland (0.28)
Asia > Middle East > UAE (0.28)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Mask-based Latent Reconstruction for Reinforcement Learning (Appendix)

Neural Information Processing SystemsAug-17-2025, 07:34:04 GMT

This work was done when Tao Y u was an intern at Microsoft Research Asia.

artificial intelligence, machine learning, reinforcement learning, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois > Cook County > Chicago (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.46)

Industry: Leisure & Entertainment (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Add feedback

This supplementary material provides additional results and discussion as well as implementation

Neural Information Processing SystemsAug-14-2025, 19:58:00 GMT

Gaussian processes for machine learning.

agent, auxiliary objective, trajectory, (12 more...)

Neural Information Processing Systems

Genre: Research Report (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

EAGER: Asking and Answering Questions for Automatic Reward Shaping in Language-guided RL

Neural Information Processing SystemsAug-14-2025, 19:57:56 GMT

To tackle the reward sparsity issue, one idea is to densify the reward by decomposing the general goal into sub-goals and rewarding them individually.

agent, auxiliary objective, trajectory, (12 more...)

Neural Information Processing Systems

Country:

North America > Dominican Republic (0.04)
North America > Canada > Quebec > Montreal (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre:

Workflow (0.93)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Add feedback

Unlearning Works Better Than You Think: Local Reinforcement-Based Selection of Auxiliary Objectives

Bendahi, Abderrahim, Fradin, Adrien, Lerasle, Matthieu

arXiv.org Machine LearningApr-19-2025

We introduce Local Reinforcement-Based Selection of Auxiliary Objectives (LRSAO), a novel approach that selects auxiliary objectives using reinforcement learning (RL) to support the optimization process of an evolutionary algorithm (EA) as in EA+RL framework and furthermore incorporates the ability to unlearn previously used objectives. By modifying the reward mechanism to penalize moves that do no increase the fitness value and relying on the local auxiliary objectives, LRSAO dynamically adapts its selection strategy to optimize performance according to the landscape and unlearn previous objectives when necessary. We analyze and evaluate LRSAO on the black-box complexity version of the non-monotonic Jump function, with gap parameter $\ell$, where each auxiliary objective is beneficial at specific stages of optimization. The Jump function is hard to optimize for evolutionary-based algorithms and the best-known complexity for reinforcement-based selection on Jump was $O(n^2 \log(n) / \ell)$. Our approach improves over this result to achieve a complexity of $\Theta(n^2 / \ell^2 + n \log(n))$ resulting in a significant improvement, which demonstrates the efficiency and adaptability of LRSAO, highlighting its potential to outperform traditional methods in complex optimization scenarios.

evolutionary algorithm, machine learning, reinforcement learning, (19 more...)

arXiv.org Machine Learning

2504.14418

Country:

Europe > Spain > Andalusia > Málaga Province > Málaga (0.05)
North America > United States > New York > New York County > New York City (0.04)
Europe > France (0.04)
(4 more...)

Genre: Research Report (0.84)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)

Add feedback

RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning

Gao, Hao, Chen, Shaoyu, Jiang, Bo, Liao, Bencheng, Shi, Yiang, Guo, Xiaoyang, Pu, Yuechuan, Yin, Haoran, Li, Xiangyu, Zhang, Xinbang, Zhang, Ying, Liu, Wenyu, Zhang, Qian, Wang, Xinggang

arXiv.org Artificial IntelligenceFeb-18-2025

Existing end-to-end autonomous driving (AD) algorithms typically follow the Imitation Learning (IL) paradigm, which faces challenges such as causal confusion and the open-loop gap. In this work, we establish a 3DGS-based closed-loop Reinforcement Learning (RL) training paradigm. By leveraging 3DGS techniques, we construct a photorealistic digital replica of the real physical world, enabling the AD policy to extensively explore the state space and learn to handle out-of-distribution scenarios through large-scale trial and error. To enhance safety, we design specialized rewards that guide the policy to effectively respond to safety-critical events and understand real-world causal relationships. For better alignment with human driving behavior, IL is incorporated into RL training as a regularization term. We introduce a closed-loop evaluation benchmark consisting of diverse, previously unseen 3DGS environments. Compared to IL-based methods, RAD achieves stronger performance in most closed-loop metrics, especially 3x lower collision rate. Abundant closed-loop results are presented at https://hgao-cv.github.io/RAD.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2502.13144

Genre: Research Report (0.64)

Industry:

Energy (0.96)
Transportation > Ground > Road (0.36)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

HyperDPO: Conditioned One-Shot Multi-Objective Fine-Tuning Framework

Ren, Yinuo, Xiao, Tesi, Shavlovsky, Michael, Ying, Lexing, Rahmanian, Holakou

arXiv.org Artificial IntelligenceDec-3-2024

In LLM alignment and many other ML applications, one often faces the Multi-Objective Fine-Tuning (MOFT) problem, i.e. fine-tuning an existing model with datasets labeled w.r.t. different objectives simultaneously. To address the challenge, we propose the HyperDPO framework, a conditioned one-shot fine-tuning approach that extends the Direct Preference Optimization (DPO) technique, originally developed for efficient LLM alignment with preference data, to accommodate the MOFT settings. By substituting the Bradley-Terry-Luce model in DPO with the Plackett-Luce model, our framework is capable of handling a wide range of MOFT tasks that involve listwise ranking datasets. Compared with previous approaches, HyperDPO enjoys an efficient one-shot training process for profiling the Pareto front of auxiliary objectives, and offers post-training control over trade-offs. Additionally, we propose a novel Hyper Prompt Tuning design, that conveys continuous importance weight across objectives to transformer-based models without altering their architecture, and investigate the potential of temperature-conditioned networks for enhancing the flexibility of post-training control. We demonstrate the effectiveness and efficiency of the HyperDPO framework through its applications to various tasks, including Learning-to-Rank (LTR) and LLM alignment, highlighting its viability for large-scale ML deployments.

hyperdpo framework, objective, pareto front, (12 more...)

arXiv.org Artificial Intelligence

2410.08316

Country: North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback